We propose a routing algorithm that takes a sequence of vectors and computes a new sequence with specified length and vector size. Each output vector maximizes "bang per bit," the difference between a net benefit to use and net cost to ignore data, by better predicting the input vectors. We describe output vectors as geometric objects, as latent variables that assign credit, as query states in a model of associative memory, and as agents in a model of a Society of Mind. We implement the algorithm with optimizations that reduce parameter count, computation, and memory use by orders of magnitude, enabling us to route sequences of greater length than previously possible. We evaluate our implementation on natural language and visual classification tasks, obtaining competitive or state-of-the-art accuracy and end-to-end credit assignments that are interpretable.
We propose a method for efficient hierarchical classification in parallel. Our method transforms a batch of classification scores and labels, corresponding to given nodes in a semantic tree, into scores and labels corresponding to all nodes in their ancestor paths. We implement our method on current hardware accelerators, using a tree that combines all English synsets in WordNet 3.0, spanning 117,659 classes at up to 20 levels of depth. Transforming a batch of scores and labels into their respective ancestor paths incurs negligible computation and consumes only a fixed 0.04 GB of memory beyond the footprint of the data.
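The core transformation described above can be sketched with a parent-pointer table: with a tree of depth D, D successive vectorized gathers recover every ancestor of a batch of class indices, so the only extra memory is the fixed table. This is a minimal illustrative sketch on a toy tree, not the paper's implementation; all names and the tree itself are hypothetical.

```python
import numpy as np

# Toy tree: node 0 is the root and is its own parent; others point upward.
parent = np.array([0, 0, 0, 1, 1, 2, 2])  # parent[i] = parent of node i
depth = 3                                  # depth of this toy tree

def ancestor_paths(labels: np.ndarray) -> np.ndarray:
    """Return an array of shape (batch, depth) holding each label's
    ancestor chain, from the node itself up to the root."""
    paths = np.empty((labels.shape[0], depth), dtype=parent.dtype)
    node = labels
    for level in range(depth):
        paths[:, level] = node
        node = parent[node]  # one vectorized gather per tree level
    return paths

batch = np.array([3, 5, 6])
print(ancestor_paths(batch))
```

Each gather touches the whole batch at once, which is why the cost per example is negligible and independent of batch size on an accelerator.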
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). Although a median of 80 working hours was spent on method development, 32% of participants stated that they did not have enough time for it, and 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based; of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% performed ensembling, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
The upcoming exascale era will provide a new generation of physics simulations. These simulations will have a high spatiotemporal resolution, which will impact the training of machine learning models, since storing such large amounts of simulation data on disk is nearly impossible. Therefore, we need to rethink the training of machine learning models for simulations in the upcoming exascale era. This work presents an approach that trains a neural network concurrently with a running simulation, without storing data on disk. The training pipeline accesses the training data by in-memory streaming. Furthermore, we apply methods from the domain of continual learning to enhance the generalization of the model. We tested our pipeline on the training of a 3D autoencoder trained concurrently with a laser wakefield acceleration particle-in-cell simulation. Furthermore, we experimented with various continual learning methods and their effect on generalization.
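The streaming idea above can be sketched as a training loop that consumes mini-batches directly from a simulation's output as it is produced, so data never touches disk. Everything here is hypothetical scaffolding: the simulation is a random-field stand-in and the training step a toy linear-autoencoder update, not the paper's pipeline.

```python
import numpy as np

def simulation_stream(steps, grid=32, seed=0):
    """Stand-in for a running simulation, yielding one field per time step."""
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        yield rng.standard_normal((grid, grid)).astype(np.float32)

def train_step(weights, batch, lr=1e-3):
    """Toy linear-autoencoder gradient step standing in for real training."""
    x = batch.reshape(len(batch), -1)
    recon = x @ weights
    grad = x.T @ (recon - x) / len(batch)
    return weights - lr * grad

weights = np.eye(32 * 32, dtype=np.float32) * 0.5
buffer = []
for field in simulation_stream(steps=8):
    buffer.append(field)          # data lives only in memory
    if len(buffer) == 4:          # train as soon as a mini-batch is ready
        weights = train_step(weights, np.stack(buffer))
        buffer.clear()
```

The key design point is that the buffer holds only a mini-batch at a time; a continual-learning method would additionally guard against the model forgetting earlier phases of the simulation it can no longer revisit.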
Knowledge graphs have become an effective tool for managing and standardizing semi-structured domain knowledge in a human- and machine-interpretable way. In graph-based domain applications, such as embeddings and graph neural networks, current research increasingly takes into account the time-dependent evolution of the information encoded in a graph. Algorithms and models for stationary and static knowledge graphs are being extended to make them suitable for time-aware domains, where time-awareness can be interpreted in different ways. In particular, a distinction can be made between the validity period and the traceability of facts as objectives of time-related knowledge graph extensions. In this context, terms and definitions such as "dynamic" and "temporal" are often used inconsistently or interchangeably in the literature. With this paper, we therefore aim to provide a short but well-defined overview of time-aware knowledge graph extensions and thereby facilitate future research in this field.
Deep learning-based solutions are being successfully implemented for a wide variety of applications. Most notably, clinical use cases have gained increasing interest and have been the main driver behind some of the cutting-edge data-driven algorithms proposed in recent years. For applications such as sparse-view reconstruction, where the amount of measured data is kept small so that acquisition times are short and radiation doses are low, reducing the resulting streaking artifacts has motivated the development of data-driven denoising algorithms, whose main goal is to obtain diagnostically viable images from only a subset of the full-scan data. We propose WNet, a data-driven dual-domain denoising model that includes a trainable reconstruction layer for sparse-view denoising. Two encoder-decoder networks perform denoising in the sinogram and reconstruction domains simultaneously, while a third layer implementing the filtered back-projection algorithm is sandwiched between the first two and takes care of the reconstruction operation. We investigate the performance of the network on sparse-view chest CT scans and highlight the added benefit of the trainable reconstruction layer over a more conventional fixed layer. We train and test our network on two clinically relevant datasets, and compare the obtained results with three different types of sparse-view CT denoising and reconstruction algorithms.
GitHub Copilot, an extension for the Visual Studio Code development environment powered by the large-scale language model Codex, offers automatic program synthesis to software developers. This model has been extensively studied in the field of deep learning; however, a comparison with genetic programming, which is also known for its performance in automatic program synthesis, has not yet been carried out. In this paper, we evaluate GitHub Copilot on standard program synthesis benchmark problems and compare the achieved results with those reported in the genetic programming literature. In addition, we discuss the performance of both approaches. We find that the performance of the two approaches on the benchmark problems is quite similar; however, in comparison to GitHub Copilot, program synthesis approaches based on genetic programming are not yet mature enough to support programmers in practical software development. Genetic programming usually requires a huge amount of expensive hand-labeled training cases and takes too much time to generate solutions. Furthermore, the source code generated by genetic programming approaches is often bloated and difficult to understand. For future work on program synthesis with genetic programming, we suggest that researchers focus on improving execution time, readability, and usability.
Overparameterized models can perfectly learn various types of data distributions; however, the generalization error is typically lower for real data than for artificial data. This suggests that properties of the data distribution have an impact on generalization capability. This work focuses on the search space defined by the input data and hypothesizes that the correlation between the labels of neighboring input values influences generalization. If the correlation is low, the randomness of the input data space is high, leading to high generalization error. We propose measuring the randomness of the input data space using Maurer's universal statistic. Results on synthetic classification tasks and common image classification benchmarks (MNIST, CIFAR10, and Microsoft's Cats vs. Dogs dataset) show a high correlation between the randomness of the input data space and the generalization error of deep neural networks on binary classification problems.
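The randomness measure referenced above can be sketched as follows: Maurer's universal statistic averages the log2 distance between repeated fixed-length blocks of a bit sequence, so repetitive (structured) sequences score low and unpredictable ones score high. This is a hedged toy sketch; the block length and warm-up segment size here are illustrative values, not the paper's settings, and the standard test assumes much longer sequences.

```python
import numpy as np

def maurer_statistic(bits, L=2, Q=8):
    """Average log2 gap between repeated L-bit blocks, after Q warm-up blocks."""
    n_blocks = len(bits) // L
    blocks = [int("".join(map(str, bits[i * L:(i + 1) * L])), 2)
              for i in range(n_blocks)]
    last_seen = {}
    for i in range(Q):                       # initialization segment
        last_seen[blocks[i]] = i
    total, K = 0.0, n_blocks - Q
    for i in range(Q, n_blocks):             # test segment
        gap = i - last_seen.get(blocks[i], -1)  # distance to last occurrence
        total += np.log2(gap)
        last_seen[blocks[i]] = i
    return total / K
```

A constant sequence repeats every block immediately (gap 1, statistic 0), while less predictable block orderings produce larger gaps; under the abstract's hypothesis, a higher statistic for the input data space goes with higher generalization error.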
We briefly review common assumptions about biological learning drawn from findings in experimental neuroscience and contrast them with the efficiency of gradient-based learning in recurrent neural networks. The key issues discussed in this review include: synaptic plasticity, neural circuits, the theory-experiment divide, and objective functions. We conclude with recommendations for both theoretical and experimental neuroscientists when designing new studies that could help bring clarity to these issues.
The environment perception of autonomous vehicles is limited by their physical sensor range and algorithmic performance, as well as by occlusions that degrade their understanding of the ongoing traffic situation. This not only poses a significant threat to safety and limits driving speeds, but can also lead to inconvenient maneuvers. Intelligent infrastructure systems can help alleviate these problems. They can fill gaps in a vehicle's perception and extend its field of view by providing additional detailed information about its surroundings, in the form of a digital model of the current traffic situation, i.e., a digital twin. However, detailed descriptions and working prototypes demonstrating the feasibility of such systems are scarce. In this paper, we propose a hardware and software architecture that enables such a reliable intelligent infrastructure system. We have implemented the system in the real world and show that it is able to create an accurate digital twin of an extended highway stretch, thereby improving the perception of autonomous vehicles beyond the limits of their on-board sensors. Furthermore, we evaluate the accuracy and reliability of the digital twin by using aerial imagery and earth observation methods to generate ground-truth data.